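In meta reinforcement learning (meta RL), an agent learns from a set of training tasks how to quickly solve a new task drawn from the same task distribution. The optimal meta RL policy, also known as the Bayes-optimal behavior, is well defined and guarantees optimal reward in expectation over the task distribution. The question we explore in this work is how many training tasks are required to guarantee approximately optimal behavior with high probability. Recent work provided the first such PAC analysis for a model-free setting, in which a history-dependent policy is learned from the training tasks. In this work, we propose a different approach: directly learn the task distribution using density estimation techniques, and then train a policy on the learned task distribution. We show that our approach leads to bounds that depend on the dimension of the task distribution. In particular, in settings where the task distribution lies on a low-dimensional manifold, we extend our analysis to use dimensionality reduction techniques and account for this structure, obtaining significantly better bounds than previous work, which depend strictly on the number of states and actions. The key to our approach is the regularization implied by the kernel density estimation method. We further demonstrate that this regularization is useful in practice when "plugged in" to the state-of-the-art VariBAD meta RL algorithm.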
Translated by Google Translate
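The density-estimation step described in the abstract above can be illustrated with a minimal sketch. It assumes each training task is summarized by a vector of task parameters and uses a plain Gaussian kernel density estimate; the function names `kde_log_density` and `kde_sample`, the fixed bandwidth, and the toy task vectors are illustrative assumptions, not the paper's implementation.

```python
import math
import random


def kde_log_density(x, samples, bandwidth):
    """Log-density of an isotropic Gaussian KDE at point x.

    x and every element of samples are equal-length lists of floats.
    """
    d = len(x)
    # Log normalizer of one d-dimensional Gaussian kernel.
    log_norm = -0.5 * d * math.log(2.0 * math.pi * bandwidth ** 2)
    log_terms = []
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        log_terms.append(log_norm - sq_dist / (2.0 * bandwidth ** 2))
    # Log-sum-exp for numerical stability, then average over kernels.
    m = max(log_terms)
    log_sum = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return log_sum - math.log(len(samples))


def kde_sample(samples, bandwidth, rng):
    """Draw one new task vector: pick a training task, add Gaussian noise."""
    center = rng.choice(samples)
    return [c + rng.gauss(0.0, bandwidth) for c in center]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 2-D task parameters (e.g. goal positions) of training tasks.
    train_tasks = [[0.0, 1.0], [0.2, 0.9], [-0.1, 1.1]]
    new_task = kde_sample(train_tasks, bandwidth=0.1, rng=rng)
    print(new_task, kde_log_density(new_task, train_tasks, 0.1))
```

Sampling from the estimate rather than replaying the training tasks is what smooths the learned distribution; the bandwidth controls how much smoothing is applied.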
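We propose a new representation of visual data that disentangles object position from appearance. Our method, termed Deep Latent Particles (DLP), decomposes the visual input into low-dimensional latent "particles", where each particle is described by its spatial location and the features of its surrounding region. To learn such representations, we follow a VAE-based approach and introduce a prior for particle positions based on a spatial-softmax architecture, together with a modification of the evidence lower bound loss inspired by the Chamfer distance between particles. We demonstrate that our DLP representations are useful for downstream tasks such as unsupervised keypoint (KP) detection, image manipulation, and video prediction for scenes composed of multiple dynamic objects. In addition, we show that our probabilistic interpretation of the problem naturally provides uncertainty estimates for particle locations, which can be used for model selection, among other tasks. Videos and code are available: https://taldatech.github.io/deep-latent-particles-web/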
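For readers unfamiliar with the Chamfer distance mentioned in the abstract above, the following is a minimal sketch of the standard symmetric Chamfer distance between two point sets, using squared Euclidean distances. It is not the modified evidence lower bound from the paper; the function name and the toy particle positions are assumptions for illustration.

```python
def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Each point is a sequence of floats. For every point, take the squared
    Euclidean distance to its nearest neighbor in the other set, average
    within each direction, and add the two directions.
    """
    def nearest_sq_dist(p, others):
        return min(sum((x - y) ** 2 for x, y in zip(p, q)) for q in others)

    a_to_b = sum(nearest_sq_dist(p, points_b) for p in points_a) / len(points_a)
    b_to_a = sum(nearest_sq_dist(q, points_a) for q in points_b) / len(points_b)
    return a_to_b + b_to_a


if __name__ == "__main__":
    # Hypothetical 2-D particle positions.
    particles_a = [(0.0, 0.0), (1.0, 0.0)]
    particles_b = [(0.0, 0.0), (1.0, 1.0)]
    print(chamfer_distance(particles_a, particles_b))  # prints 1.0
```

Because each point is matched only to its nearest neighbor, the distance is invariant to the ordering of the particles, which is the property that makes it a natural fit for unordered sets.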
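A practical approach to learning robot skills, often termed sim2real, is to train control policies in simulation and then deploy them on a real robot. Popular techniques for improving sim2real transfer build on domain randomization (DR): training the policy on a diverse set of randomly generated domains, in the hope of better generalization to the real world. Because of the large number of hyperparameters in both the policy learning and DR algorithms, one often ends up with a large number of trained models, where choosing the best one requires costly evaluation on the real robot. In this work we ask: can we rank the policies without running them in the real world? Our main idea is that a predefined set of real-world data can be used to evaluate all policies, using out-of-distribution (OOD) detection techniques. In a sense, this approach can be seen as a "unit test" for evaluating policies before any real-world execution. However, we find that, by itself, the OOD score can be very sensitive to the particular OOD method. Our main contribution is a simple yet effective policy score that combines OOD detection with evaluation in simulation. We show that our score, VSDR, can significantly improve the accuracy of policy ranking without requiring additional real-world data. We evaluate the effectiveness of VSDR for sim2real transfer in a robotic grasping task with image inputs. We extensively evaluate different DR parameters and OOD methods, and show that VSDR improves policy selection across the board. More importantly, our method achieves significantly better ranking while using significantly less data than the baselines.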
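In anomaly detection (AD), one seeks to identify whether a test sample is abnormal, given a dataset of normal samples. A recent and promising approach to AD relies on deep generative models, such as variational autoencoders (VAEs), for unsupervised learning of the normal data distribution. In semi-supervised AD (SSAD), the data also includes a small sample of labeled anomalies. In this work, we propose two variational methods for training VAEs for SSAD. The intuitive idea in both methods is to train the encoder to "separate" the latent vectors of normal and anomalous data. We show that this idea can be derived from principled probabilistic formulations of the problem, and we propose simple and effective algorithms. Our methods can be applied to various data types, as we demonstrate on SSAD datasets ranging from natural images to astronomy and medicine, can be combined with any VAE model architecture, and are naturally compatible with ensembling. When compared with state-of-the-art SSAD methods that are not specific to a particular data type, we obtain a marked improvement in outlier detection.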
For many applications of reinforcement learning it can be more convenient to specify both a reward function and constraints, rather than trying to design behavior through the reward function. For example, systems that physically interact with or around humans should satisfy safety constraints. Recent advances in policy search algorithms (
Spurious correlations in training data often lead to robustness issues, since models learn to use them as shortcuts. For example, when predicting whether an object is a cow, a model might learn to rely on its green background, so it would do poorly on a cow on a sandy background. Waterbirds is a standard dataset for benchmarking methods that mitigate this problem. The best method (Group Distributionally Robust Optimization, GroupDRO) currently achieves 89% worst-group accuracy, while standard training from scratch on raw images reaches only 72%. GroupDRO requires training a model end to end with subgroup labels. In this paper, we show that we can achieve up to 90% accuracy without using any subgroup information in the training set, simply by taking embeddings from a large pre-trained vision model and training a linear classifier on top of them. With experiments on a wide range of pre-trained models and pre-training datasets, we show that the capacity of the pre-trained model and the size of the pre-training dataset matter. Our experiments reveal that high-capacity vision transformers perform better than high-capacity convolutional neural networks, and that a larger pre-training dataset leads to better worst-group accuracy on the spurious correlation dataset.
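The worst-group accuracy metric quoted in the abstract above can be computed as in the following minimal sketch: accuracy is measured separately for each subgroup, and the minimum is reported. The function name, the group labels, and the toy predictions are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def worst_group_accuracy(y_true, y_pred, groups):
    """Return (worst accuracy, per-group accuracies) over the given groups.

    y_true, y_pred, and groups are equal-length sequences; groups[i] is the
    subgroup identifier of example i (e.g. a label/background combination).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    per_group = {g: correct[g] / total[g] for g in total}
    return min(per_group.values()), per_group


if __name__ == "__main__":
    # Hypothetical labels, predictions, and subgroup assignments.
    y_true = [1, 1, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 1, 0, 0]
    groups = ["land", "land", "land", "water", "water", "water"]
    worst, per_group = worst_group_accuracy(y_true, y_pred, groups)
    print(worst, per_group)
```

Note that subgroup labels are needed here only for evaluation; the approach described in the abstract does not use them during training.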
Image segmentation is a fundamental task in computer vision. Data annotation for training supervised methods can be labor-intensive, motivating unsupervised methods. Some existing approaches extract deep features from pre-trained networks and build a graph to apply classical clustering methods (e.g., $k$-means and normalized cuts) as a post-processing stage. These techniques reduce the high-dimensional information encoded in the features to pairwise scalar affinities. In this work, we replace classical clustering algorithms with a lightweight Graph Neural Network (GNN) trained to optimize the same clustering objective. However, in contrast to existing approaches, we feed the GNN not only the pairwise affinities between local image features but also the raw features themselves. Maintaining this connection between the raw features and the clustering goal allows us to perform semantic part segmentation implicitly, without requiring additional post-processing steps. We demonstrate how classical clustering objectives can be formulated as self-supervised loss functions for training our image segmentation GNN. Additionally, we use the Correlation-Clustering (CC) objective to perform clustering without defining the number of clusters ($k$-less clustering). We apply the proposed method to object localization, segmentation, and semantic part segmentation tasks, surpassing state-of-the-art performance on multiple benchmarks.
Training a generative model on a single image has drawn significant attention in recent years. Single-image generative methods are designed to learn the internal patch distribution of a single natural image at multiple scales. These models can be used for drawing diverse samples that semantically resemble the training image, as well as for solving many image editing and restoration tasks that involve that particular image. Here, we introduce an extended framework that simultaneously learns the internal distributions of several images by using a single model with spatially varying image-identity conditioning. Our BlendGAN opens the door to applications that are not supported by single-image models, including morphing, melding, and structure-texture fusion between two or more arbitrary images.
Latent variable models such as the Variational Auto-Encoder (VAE) have become a go-to tool for analyzing biological data, especially in the field of single-cell genomics. One remaining challenge is the interpretability of latent variables as biological processes that define a cell's identity. Outside of biological applications, this problem is commonly referred to as learning disentangled representations. Although several disentanglement-promoting variants of the VAE have been introduced and applied to single-cell genomics data, this task has been shown to be infeasible from independent and identically distributed measurements without additional structure. Instead, recent methods propose to leverage non-stationary data, as well as the sparse mechanism shift assumption, in order to learn disentangled representations with causal semantics. Here, we extend the application of these methodological advances to the analysis of single-cell genomics data with genetic or chemical perturbations. More precisely, we propose a deep generative model of single-cell gene expression data in which each perturbation is treated as a stochastic intervention targeting an unknown, but sparse, subset of latent variables. We benchmark these methods on simulated single-cell data to evaluate their performance at latent unit recovery, causal target identification, and out-of-domain generalization. Finally, we apply these approaches to two real-world large-scale gene perturbation datasets and find that models that exploit the sparse mechanism shift hypothesis surpass contemporary methods on a transfer learning task. We implement our new model and benchmarks using the scvi-tools library, and release them as open-source software at https://github.com/Genentech/sVAE.
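Implicit generative models, such as generative adversarial networks and diffusion models, have become ubiquitous in recent years. While these models do show remarkable results, evaluating their performance is challenging. This issue is vital for pushing research forward and for distinguishing meaningful gains from random noise. Currently, heuristic metrics such as the Inception Score (IS) and the Fréchet Inception Distance (FID) are the most common evaluation metrics, but what they measure is not entirely clear. Moreover, there are open questions about how meaningful their scores actually are. In this work, we study the evaluation metrics of generative models by generating a high-quality synthetic dataset on which we can estimate classical metrics for comparison. Our study shows that while FID and IS do correlate with several f-divergences, their rankings of close models can differ considerably, making them problematic when used for fine-grained comparison. We further use this experimental setting to study which evaluation metrics correlate with our probabilistic metrics. Lastly, we look into the base features used for metrics such as FID.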
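As background for the FID metric discussed in the abstract above: FID is the Fréchet distance between two Gaussians fitted to feature statistics, $\lVert \mu_1 - \mu_2 \rVert^2 + \mathrm{Tr}\big(\Sigma_1 + \Sigma_2 - 2(\Sigma_1 \Sigma_2)^{1/2}\big)$. The sketch below is a simplification that assumes diagonal covariances, in which case the trace term reduces to a sum of squared differences of standard deviations. The full metric uses full covariance matrices of Inception features; the function name and the toy statistics here are illustrative assumptions.

```python
import math


def frechet_distance_diag(mu1, var1, mu2, var2):
    """Fréchet distance between two Gaussians with DIAGONAL covariances.

    mu1, mu2 are mean vectors; var1, var2 are per-dimension variances.
    With diagonal covariances, Tr(S1 + S2 - 2 (S1 S2)^(1/2)) equals
    sum_i (sqrt(var1_i) - sqrt(var2_i))^2.
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    cov_term = sum((math.sqrt(v1) - math.sqrt(v2)) ** 2
                   for v1, v2 in zip(var1, var2))
    return mean_term + cov_term


if __name__ == "__main__":
    # Hypothetical 2-D feature statistics for "real" and "generated" data.
    real_mu, real_var = [0.0, 0.0], [1.0, 1.0]
    fake_mu, fake_var = [1.0, 0.0], [4.0, 1.0]
    print(frechet_distance_diag(real_mu, real_var, fake_mu, fake_var))  # prints 2.0
```

The distance is zero only when both the means and the variances match, which is why it responds to both sample quality and diversity but says nothing about which of the two caused a change in score.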